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Abstract 

Ensembles of random stochastic and bistochastic matrices are investigated. While all columns of a ran- 
dom stochastic matrix can be chosen independently, the rows and columns of a bistochastic matrix have to 
be correlated. We evaluate the probability measure induced into the Birkhoff polytope of bistochastic ma- 
trices by applying the Sinkhorn algorithm to a given ensemble of random stochastic matrices. For matrices 
of order N - 2 we derive explicit formulae for the probability distributions induced by random stochastic 
matrices with columns distributed according to the Dirichlet distribution. For arbitrary N we construct an 
initial ensemble of stochastic matrices which allows one to generate random bistochastic matrices according 
to a distribution locally flat at the center of the Birkhoff polytope. The value of the probability density at 
this point enables us to obtain an estimation of the volume of the Birkhoff polytope, consistent with recent 
asymptotic results. 
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1 Introduction 

A stochastic matrix M is defined as a square matrix of size N, consisting of non-negative elements, such that the sum 
in each column is equal to unity. Such matrices provide an important tool often applied in various fields of theoretical 
physics, since they represent Markov chains. In other words, any stochastic matrix maps the set of probability vectors 
into itself. Weak positivity of each element of M guarantees that the image vector p' = Mp does not contain any negative 
components, while the probability is preserved due to the normalization of each column of M. 

A stochastic matrix B is called bistochastic (or doubly stochastic) if additionally each of its rows sums up to unity, so 
that the map preserves identity and for this reason it is given the name unital. Bistochastic matrices are used in the theory 
of majorization [1-3] and emerge in several physical problems [4]. For instance they may represent a transfer process at 
an oriented graph consisting of N nodes. 
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The set Sn of bistochastic matrices of size N can be viewed as a convex polyhedron in R (A, ~ 1)_ . Due to the Birkhoff 
theorem, any bistochastic matrix can be represented as a convex combination of permutation matrices. This (N - l) 2 
dimensional set is often called Birkhoff polytope. Its volume with respect to the Euclidean measure is known [5-7] for 
2 < N < 10. 

To generate a random stochastic matrix one may take an arbitrary square matrix with non-negative elements and renor- 
malize each of its columns. Alternatively, one may generate independently each column according to a given probability 
distribution defined on the probability simplex. A standard choice is the Dirichlet distribution (14), which depends on 
the real parameter s > and interpolates between the uniform measure obtained for 5=1 and the statistical measure for 
s = 1/2 — see e.g. [8]. 

Random bistochastic matrices are more difficult to generate, since the constraints imposed for the sums in each column 
and each row imply inevitable correlations between elements of the entire matrix. In order to obtain a bistochastic matrix 
one needs to normalize all its rows and columns, and this cannot be performed independently. However, since the both 
sets of stochastic and unital matrices are convex, iterating such a procedure, converges [9] and yields a bistochastic matrix. 
Note that initializing the scheme of alternating projections with different ensembles of initial conditions leads to various 
probability measures on the set. 

The aim of this work is to analyze probability measures inside the Birkhoff polytope. In particular we discuss methods 
of generating random bistochastic matrices according to the uniform (flat) measure in this set. Note that the brute force 
method of generating random points distributed uniformly inside the unit cube of dimension (N - 1 ) 2 and checking if the 
bistochasticity conditions are satisfied, is not effective even for N of order of 10, since the volume of the Birkhoff polytope 
&n decreases fast with the matrix size. 

The paper is organized as follows. In Section 2 we present after Sinkhorn [10] two equivalent algorithms producing a 
bistochastic matrix out of any square matrix of non-negative elements. An implicit formula (13) expressing the probability 
distribution in the set of bistochastic matrices for arbitrary N is derived in Sec. 3.1, while exact formulas for the case 
N - 2 are presented in Section 3.2. Furthermore, we obtain its power series expansion around the center B* of the 
Birkhoff polytope and for each N we single out a particular initial distribution in the set of stochastic matrices, such that 
the output distribution is flat (at least locally) in the vicinity of B^. Finally, in section 5 we compute the value of the 
probability density at this very point and obtain an estimation of the volume of the set of bistochastic matrices, consistent 
with recent results of Canfield and McKay [12]. In Appendix A we demonstrate equivalence of two algorithms used to 
generate random bistochastic matrices. The key expression of this paper (36) characterising the probability distribution 
for random bistochastic matrices in vicinity of the center of the Birkhoff polytope is derived in Appendix B, while the 
third order expansion is worked out in Appendix C. 

2 How to generate a bistochastic matrix? 
2.1 Algorithm useful for numerical computation 

In 1964 Sinkhorn [10] introduced the following iterative algorithm leading to a bistochastic matrix, based on alternating 
normalization of rows and columns of a given square matrix with non-negative entries: 

Algorithm 1 (rows/columns normalization) 

1) take an input N x N stochastic matrix M such that each row contains at least one positive element, 

2) normalize each row-vector of M by dividing it by the sum of its elements, 

3) normalize each column-vector as in the previous point 2), 

4) stop if the matrix M is bistochastic up to certain accuracy in some norm || ■ ||, otherwise go to point 2) . 



Random bistochastic matrices 3 



The above algorithm is symbolically visualized in Fig. 1 . For an initial point M one may take an arbitrary matrix with 
non-negative entries. To fix the scale we may assume that the sum of all entries is equal to N, so M belongs to interior of 
the (N - 1) dimensional simplex A N i_ t . The transformation/? of normalization of the rows of M produces a unital matrix, 
for which the sum of all (non-negative) entries in each row is equal to unity. Subsequent normalization of the columns of 
R{M) maps this matrix into the set of stochastic matrices. This step can be rewritten as C — TRT, where T denotes the 
transposition of the matrix. Hence the entire map reads II := CR = (T o R) 2 . For instance if N = 2 in the limit we aim to 
get a bistochastic matrix 




bistochastic 



Figure 1 : Sketch of the iteration procedure: a matrix M consisting of non-negative entries is sent by the transformation R 
(normalization of rows) into the set of unital matrices, and then by the transformation C (normalization of columns) into 
the set of stochastic matrices. Iterating the map II = (T o R) 2 one arrives at a bistochastic matrix Moo. 

Since both these sets are convex, our procedure can be considered as a particular example of a general construction 
called 'projections on convex sets'. Due to convexity of these sets the procedure of alternating projections converges to a 
point belonging to the intersection of both sets [9]. An analogous method was recently used by Audenaert and Scheel to 
generate quantum bistochastic maps [13]. 

2.2 Algorithm suitable for analytical calculation 

To perform analytical calculations of probability distribution inside the Birkhoff polytope we are going to use yet another 
algorithm to generate bistochastic matrix, the idea of which is due to Djokovic [14]. Already in his earlier paper [10] 
Sinkhorn demonstrated that for a given positive matrix M there exists exactly one doubly stochastic matrix B such that 
B = D L MD R . In order to extend such important result from posistive matrices to non-negative ones, one has to introduce 
the hypotesis of fully indecomposability [1 1, 14]. For the sake of clarity and reading, we prefer to mention here that the set 
of non-fully indecomposable (stochastic) matrices constitute a zero measure set within the set of all stochastic matrices, 
instead of going through the details of Sinkhorn's proof. This means that the converge of our algorithms we will assume to 
hold true from now onwards, has to be intended almost everywhere in the compact set of stochastic matrices, with respect 
to the usual Lebesgue measure. 

Here D L and D R denote diagonal matrices with positive entries determined uniquely up to a scalar factor. 

To set the notation, we will denote with R + the positive semi-axis (0, oo) whereas, the symbol R + will be used for R + U 
{0} = [0, oo) . Let us now consider the positive cone and the set of endomorphisms over it, End , representable 
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by means of N x N matrices M consisting of non negative elements my > 0. For any given two vectors L and R in R+ , 



one can consider a map F^.r e End given by 

End [R^] 3 M i— > M' = F L , R (M) e £«<i [R^] 

R+ 9 my i — > ray = F L , R (my) := L,- myfly e R+ . 



(2a) 
(2b) 



Defining the positive diagonal matrices Df. := L. 5., , and := i? ; 5y respectively, one can observe that T^r (M) = 
D L M D s . Our purpose is to design an algorithm that takes a generic M e End [R+ ] as an input and produces as an output 
an appropriate pair of vectors L,R e R+ such that F^r (M) =: B is bistochastic. 
Fhe stoehasticity condition implies 



^] By =1=2 L i m U R J ==> R j > and ^" - 2 L * m ^' ' 

/ i 1 k 

Analogously, unitality implies 

^ By = 1 = ^ Lj m, 7 Rj ==> Li > and ^- = ^ my ; , 

so that L,R€ (R + ) N c R^ . Both equations (3) can be merged together into a single equation for L , 



(3a) 



1 v 1 

— = > mi ; 

Li ^ 1 y, L, 



(3b) 



(4) 



which can be interpreted as a kind of equation of the motion for L, as it corresponds to a stationary solution of the 
action-like functional 



<D [L] = - ^ In (Ld + J] In £ L* my 



(5) 



Equations (4-5) imply that if L is a solution, then for any A e R the rescaled vector AL is as well a solution of (5). Fhus 
we may fix L N = 1 and try to solve (4) for L\,Lz,..., L#-i . Differentiating eq. (5) we get 



<90> 
dLi 



1 

Li 



-2> 



, where S y := L, my 



1 



(6) 



is a stochastic matrix. Since L; ^ 0, unitality of S is attained once we impose stationarity to (6). Hence the stationary L 
implies that S becomes bistochastic. Equation (5) displays convexity of <t> for very small L, (i = 1,2, . . .,N - 1 , Ljv = 1 )■ 
Fhe function O is convex at the stationary point and starts to become concave for large L, . Thus there is a unique minimum 
of the function O which can be reached by the following iteration procedure: 



r(n) 



1 



(7) 



7 2j L * W ^ 



where we fix L^ and iterate the remaining components L\,Li, . . . , Ljy-i only. We start with setting Lr,' = 1 , Vfc which 
leads to 
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Algorithm 2 (convergent sequences of R N vectors) 

1) take an input N xN stochastic matrix M = {»iy}.. and define the vector L (0) = (1, 1, . . ., 1) T e R N , 

2) run equation (7) yielding the vector L (n) out of L ( "~ l) , 

3) stop if the matrix S (n) := L ( " ] mtj -= — — is bistochastic up to a certain accuracy in some norm || • ||, 

£j L k m v 

k 

otherwise go to point 2) . 

The Algorithm (1) is expected to converge faster than the Algorithm (2), so it can be recommended for numerical 
implementation. On the other hand Algorithm (2) is useful to evaluate analytically the probability measure induced into 
the Birkhoff polytope by a given choice of the input ensemble, and it is used for this purpose in further sections. The 
equivalence of these two algorithms is shown in Appendix A. 

3 Probability measures in the Birkhoff polytope 

Assume that the algorithm is initiated with a random matrix M drawn according to a given distribution W[{my}] of 
matrices of non negative elements my 3s 0. We want to know the distribution of the resulting bistochastic matrices B,j 
obtained as output of the Algorithm (2). To this end, using eq. (4) and imposing stationarity condition (6), we write the 
distribution for B by integrating over delta functions 



~[dL r I ••■ I ] dm pq W[{m pq }] x[~[<5 

^° Vr=l J** J° \p,q=l I /',,/= 1 



Bjj - Li III i. 



X 



6(L N -l)]~[s 



xJ{L u L 2 ,...,L N - l } , (8) 



where the Jacobian factor reads 

J{Li,La,...,L N -\) := det 



3 2 <6 



dU dL { 



N-\ 



(N-l 



V 1=1 i ) 



li xdet[l-fifi T ], 



(9) 



Here and in the following [l - BB t ] n will indicate the (N - 1) X (N - 1) block matrix [d i( - 2^=i B U B tj[- \ > that 
is positive defined, and the symbol P[{A,-y}] will denote the probability density P of matrices A = {A, y }. This notation 
will also be used for matrices whose elements are functions of elements of another matrix, namely P[{/(A,y)}] . Plugging 
eq. (9) into (8) and introducing again the delta functions for variables Rj of (3a) we obtain 

Jr-oo poo i N \ poo poo I N \ „oo poo ( N \ N 

■ ■ ~[ dLr ■ ■ ■ ~[ dR s 8 (L N - 1) I • • • ] dm pq W[{m pq }] X ]~[ 6 (b u - L, m, 7 Rj) x 

JO \r=l J** ^° Vj=l / ^° ^° \p,a=l J U=\ 



N-l f j \ N 

~\s\-— + Y j m ut R, xY]s 

K=l V " 1 ) W =l 



h m/n 



xfjl xdetfl-^L 



I 2 



(10) 
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Using the property of the Dirac delta function and making use of the Heaviside step function 6, we perform integration 
over the variables dm pq . Introducing new variables a-, := 1 /L, and B, := 1/J?,- , so that dL, dRj i— > L 2 R 2 da, dBj , we get 



Xoo poo ( N \ poo poo f N \ N 

■ ■ ■ I "[da,, or*" 1 I ■ ■ ■ I "[4/3, jSf" 1 ["[ W [{a p B pq B q }] S(a N -l)x 
J° Vr=l )J° J° \s=l ' P.q=l 

N-l I \ N I \ N 

xdet[t-BB J ] N _j x]~[(S X I1 (5 x]~[0(B flf ). 



(11) 



/ a,c=l 



The last three factors show that B,j is bistochastic. The factor det [l - BB T ] indicates that the expression is meaningful 
only in the case for which the leading eigenvalue 1 of BB 1 is non-degenerate. 
If the matrix m, ; - is already stochastic, 



N ( \ N 

W[{m pq }] = V[{m pq }] xY]s 1 - J] m hw x ]~[ 0(m flc ) , 



(12) 



then the integration over Bj can be performed and we arrive at the final expression for the probability distribution inside 
the Birkhoff polytope which depends on the initial measure V in the set of stochastic matrices; 



poo poo i n \ N 

pmj]]= x l hk^- 1 n 



n 



V 



Y, r a r B rq 



6{a N - 1) x 



JV— 1 / 



det[t -BB j ] n _j xY\s 1-Yj B '" X Y\ 6 l ~Yu Bhw x n 0(Bflc) 



(13) 



/ a,c=l 



The above implicit formula, valid for any matrix size and an arbitrary initial distribution V, constitutes one of the key 
results of this paper. It will be now used to yield explicit expressions for the probability distribution inside the set of 
bistochastic matrices for various particular cases of the problem. 



3.1 Measure induced by Dirichlet distribution 

Let us now assume that the initial stochastic matrices are formed of independent columns each distributed according to 
the Dirichlet distribution [8, 15, 17], 

D S {X V . . „Vi) = «* K X ■ ■ ■ 4r-i(l - Aj Vi)" 1 . (14) 

where s > is a free parameter and the normalization constant reads a s = T [2s] /T [s] 2 . 

Algorithm 3 (Random points in the simplex according to the Dirichlet distribution) 

Following [ 1 8] we are going to sketch here a useful algorithm for generating random points in a simplex A^v_i 
according to the distribution (14). 

1) generate an A^-dimensional vector X, whose elements are independent random numbers jc, from the 
gamma distribution /(jc,-; s, 1) of shape s and rate 1, so that each of them is drawn according to the 
probability density x s , e~ Xi /F(s) ; 

2) normalize the vector X by dividing it by its t\ norm, X i — > Y := so that the entries will become 
Xi i — > y, := Xil 2lk=l x k ■ 
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A simplified version, suited for (semi)integer s is described in the appendix of [17]. In particular, to get the uniform 
distribution in the simplex (s = 1) , it is sufficient to generate N independent complex Gaussian variables (with mean zero 
and variance equal to unity) and set the probability vector by 



N 

yi = \zi\ 2 /J]\z t \' 2 

i=l 



Hence the initial stochastic matrix M is characterized by the vector consisting of N Dirichlet parameters s — {s\ , 
which determine the distribution of each column. 
The probability density can be written as 



where the normalisation factor reads 



V s [{m, 7 }] := Y\ D Sj {mij , m 2j , . . . , m N -y j) = N ]~~[ (m, 7 ) Sj 
j U 



i-< r('j) 

Thus one can obtain the probability distribution of the product 

V s [{a p B pq B q }] =N Y\(a p B pq /3 q ) Sq ~' =N \\ B p^' rl x \ | " 1 ^ ' x Y\ft 



N(s z -1) 



(15) 

, s N ), 

(16) 
(17) 

(18) 



and making use of eq. (13) one eventually arrives at a compact expression for the probability distribution in the set of 
bistochastic matrices 



Xco ^oo l N \ N , N 
■ ■■ "[da,. <>7 ~ ' [ ] — S(a N - 1) x ] ] . 



\ N I 



X X 



l- Jfi,, x[]n-^4, xY]e(B ac ). 



»=1 V t ) w=l V 



X 



(19) 



Although the results were obtained under the assumption that the initially random stochastic matrices are characterized 
by the Dirichlet distributions (16,17), one may also derive analogous results for other initial distributions. As interesting 
examples, one can consider the one-parameter family V s ^ [{my}], in which each y-column of M is drawn according to a 
different gamma distribution / (my; sj, A^j of shape Sj and rate A , that is 



A Ns J 



s/-l 



(20) 



or, allowing the exponents s to vary through the whole matrix, we can start with 



l'...,.,H„/, ; )|-| ||e („,, ; ) 



A s " 



(21) 



and recover (19) , independently on the rate A labeling the input. 
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3.2 Probability measures for N = 2 

In the simplest case, for N - 2 and Bij — I ^ 



l-d 



, formula (19) describes the probability measure P SuS2 (d) induced 



into the set of bistochastic matrices by the ensemble of stochastic matrices with two independent columns distributed 
according to the Dirichlet measure with parameters s\ and 52; after integration on a 2 , renaming a\ into a, and expressing 
det[l - Bfi T ] = 2d(l -d), we arrive at 



P Si , S2 (d) = N 



N=2 



f 

JO 



da a 



11+12-1 



1 


2s, 


1 


ad + 1 - d 




a(l -d) +d 



Is-, 



2[d(l -d)] s,+S2 - l 6(d) 0(1 -d) 



This expression can be explicitly evaluated for exemplary pairs of the Dirichlet parameters s\ and 52, 

(l-4r2)[(l + 4r>({±£)-4r] 



Pr U (r) = 



16r 3 



Pr3/2,3/2 (r) = 



Pri/2,1/2 W = 



(l - 4r 2 ) 2 [(3 + 8r 2 + 48r 4 ) In - 12r - 48r 3 ] 



16^ 2 r 5 



21n(|^) 



Pri/2.1 (r) = Pri.i/2 (r) = 1 



(22) 

(23) 
(24) 

(25) 
(26) 



where r - d - j. These distributions are plotted in Fig. 2 and compared with the numerical results. 

There is another important distribution that we would like to consider. We started our analysis by considering a stochas- 
tic matrix as an input state of the renormalization algorithm. However, as an initial point one can also take a generic 
matrix K whose four entries {ki 1 , £12, £21 , £22} are just uniformly distributed on some interval. After the first application 
of the half-step map T o R, (see Fig. 1) as 

hi 



K 



k u k n 
hi k 22 



ToR 



kn + ku 
kn 



K-21 

k 2 \ + k 2 2 

k 2 2 



a 

I — a 



1 -b 
b 



(27) 



\ku + kn k%\ + ki2> 

matrix K becomes stochastic, so that this problem can be reduced to the framework developed so far. 

The joint probability distribution of independent random numbers y'-, drawn according to the uniform distribution in 
one interval of R + , and then rescaled as 



y t -> y t = 



TlUy't 



(28) 



reads P(yi....y^) — 6(1 - Yttydl W [ max (yi)]} N [17]. In the simplest case, N-2,it gives p~(y) = l/2y 2 for y e (1/2, 1], 
(where y := y\ = 1 - y 2 ) and symmetrically for ye [0,1/2]. Using this and assuming independence between the entries 
of the matrix K, the distribution for the variable a and b of (27) reads 

,2 



P(a,b) :- p(a) x p(b) 



1 



( 2 max {a, I - a] max \b, 1 — b] 
Plugging the last expression into the r.h.s. of (22) we obtain (see Fig. 2(e)) 



(29) 



2(1 -2|r|) l+21n j±§^ 

Pr (d) = = l - , V — , (30) 

d+2|r|) 3 



where again r = d- 1/2. 
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© 



: Pr u to 



2.0 



1.0 



-0.4 -0.2 



0.2 0.4 r 



© 




Pri/2,1/2 to 




2 . 






1 . 5 






1 . 




0.5 





-0.4 -0.2 



0.2 0.4 r 



© 



PT3/2.3/2 to 



2.0 



1.5 



1.0 



0.5 



-0.4 -0.2 0.2 . 4 r 


© 

2 . 
1 . 5 

3.^0- 


Pri/2,1 0) = Pr u/2 to 


0.5 





-0.4 -0.2 



0.2 0.4 r 




Figure 2: Probability distribution Pr(r) in the set of N - 2 bistochastic matrices for various initial measures. His- 
tograms obtained numerically for a sample of 10 6 initial matrices by applying Algorithm (1) are compared with analytical 
probability distributions (solid lines); (a) semicircle-like (23) for Prii; (b) Gaussian-like (24) for Pr3/2 ; 3/2; (c) convex 
distribution (25) for Pri/2,1/2; (d) fiat distribution (26) for Pri/2,1 and (e) distribution (30). 
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3.3 Symmetries and relations with the unistochastic matrices for N = 2 



l-b 



1 — a 



The symmetry of this 



Consider the map T o R defined in (27) acting on an initially stochastic matrix 
map with respect to diagonal lines a - b and a — —b implies that: 

• the limit distribution Pr So(ifc (r) is an even function of r, 

• P r J„,s ft = P r Si,,j > f° r an y s a an d $b- The final accumulation point d e [0, 1] can be achieved from the point (a, b) as 
well as from (b, a). 



In particular the second point implies that if Pr Sa Sb (r) is the output probability density when the (a, ^-distribution is given 
by P Sa ,s t (a, b) and if s a + s\, then for any given A e [0, 1] the distribution A P Sa ,s b (a,b) + {I - A) P Sb ,s a {a, b) will give the 
same output. Using this we can restore the symmetry between (a, b) simply by picking A = 1/2. 



(a, b) := -P l/2A (a, b) + -P u/2 (a, b) 



1 



1 



2n V«(l - a) In y/b(l-b) 



(31) 



is a symmetric distribution for a and b which produce, at a long run, the uniform distribution P(d) = 1 . Note that the above 
formula is not of a product form, so the distribution in both columns are correlated. In fact such a probability distribution 
can be interpreted as a classical analogue of the quantum entangled state [8, 16]. 
Random pairs (a, b) distributed according to distribution (31) can be generated by means of the following algorithm, 

1) generate the number a according to T>\ji (a) and b according to D\ (b) 

2) flip a coin: on tails do nothing, on heads exchange a with b 

For N — 2 there exists an equivalence between the set of bistochastic and unistochastic matrices [20]. The latter set is 
defined as the set of 2 x 2 matrices whose entries are squared moduli of entries of unitary matrices. The Haar measure on 
U(2) induces a natural, uniform measure in the set of unistochastic matrices: if U is random then P(\ Ui 1 1 2 ) = P(\ C/22I 2 ) = 1 
on [0, 1]. Hence initiating Algorithm (1) with stochastic matrices distributed according to eq. (31) we produce the same 
measure in the set of bistochastic matrices as it is induced by the Haar measure on U(2) by the transformation Bij = \ Uij\ 2 . 

4 In search for the uniform distribution for an arbitrary iV 

For an arbitrary we shall compute the probability density at the center of the Birkhoff polytope, 



B* N 



i_ 1 

N N 
N N 



1 \ 

N 

,V 



N N 



(32) 



Let us begin our analysis by expanding P s [{B,y}] around the center B* (32) of the Birkhoff polytope. We start from (19) 
with N given by equation (17) , so that 



jv— 1 



P,[{B U }] = P s [{B u }] xY\6 1-2 B M x\\s 1 - 2 Bhw X \\ 6 



(33) 
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and 



] B,,/"- 1 x det[l-BB T ]^ 



6{a N - 1) x 



lN-i 



(34) 



on the manifold 2, B„, = £ A B hw = 1 , B ac > . 

4.1 Expansion of probability distribution around the center of the poly tope 

Expanding P s [{B,j}] in power of SBjj with 

i y 

we obtain, as shown in Appendix B, the following result 

!/ 2 \ 3 3 { \ 2 ) 

1+ T" 1 Z( OT «) 2 "2(^)Z^( M «) 2 + 2(^Ti)Z E*®- +0 ( (l5B)3 ) • 
\ 1 pq pq p \ q ) j 



(35) 



where cr = Ylj=\ s j denotes the sum of the Dirichlet parameters for each column and the factor 



P*:=P s [{B tj =l/N, Vy}] = N' 



T(NZ m s m ) l „ 



(36) 



(37) 



is equal to the value of the probability distribution at the center of the polytope Bn, which corresponds to 6B = 0. 

Assume now that there exists a set of Dirichlet exponents s,, such that P s [{B /)? }] is constant on the required mani- 
fold (35) . Then the quadratic form in 6B pq must be identically zero. For N — 2 this yields only one equation for two 
exponents, 2s\ + 2so + 1 = %s\S2 , which can e.g. be fulfilled by si = 1/2 and S2 — 1 (compare with Section 3.3). 

For N ^ 3 , however, this gives more independent equations, in general (N - l) 2 , namely the number of independent 
variables parameterizing the Birkhoff polytope. Being the number of exponents to be determined, if a solution exists, 
then it is unique. Actually the solution exists , and corresponds to take all s, equal to each other: let's call s this collective 
exponent. Within this constraint, the last term in (36) drops out, because of equation (35) , and the entire quadratic form 
can be zero, provided that we choose 

(— - 1) = ,* . (38) 

Now, setting cr = Ns*, we arrive at s* 2 = (n 2 s* + l)(N 2 - l), whose unique positive solution is 

,* = ^_ 2+ V^ = i-^-^ + 0(i,). 



(39) 



The distribution generated by the choice s — s* will be fiat at the center of the polytope but it needs not to be globally 
uniform. 

It is not possible to find an initial Dirichlet distribution which gives the output distribution uniform in the vicinity of the 
center of the Birkhoff polytope up to the third order — see Appendix C. 
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4.2 Numerical results for N = 3 

Properties of the measures induced in the space of bistochastic matrices by applying the iterative Algorithm (1) were 
analyzed for N = 3 . As a starting point we took a random stochastic matrix M generated according to the Dirichlet 
distribution (14) with the same parameter for all three columns, s\ = s% = S3 = s. The resulting bistochastic matrix, 
B = limn-joo I1"(M), can be parameterized by 





fin 


B12 


* 


B = 


B21 


B22 


* 




* 


* 


* 



where the *-marked entries depend on the entries B^, with j, k e {1,2}. A sample of initial points consisted of 10 8 
stochastic matrices generated according to the Dirichlet distribution with the optimal value s* — -k(7 + V77) which 
follows from eq. (39)). It produces an ensemble covering the entire AD Birkhoff polytope formed by the convex hull of 
the six different permutation matrices of order three. 




i 1 
B11 



Figure 3: Probability density at a subset of the Birkhoff polytope for N — 3, the "fat" hexagon characterized by 
[fin , B\2 , \ ± 0.01 , I + 0.01 J , for initially stochastic matrices generated with the Dirichlet parameter s* given by 
eq. (39). 



To visualize numerical results we selected the cases for which B21 = B22 = 1/3 ± 0.01. Such a two dimensional 
cross-section of the Birkhoff polytope has a shape of a hexagon at the plane (B\i,Bvi), centered at the center of the body, 
B* = [ | , i , i ; . . . , |J. Figure 3 shows the probability distribution along this section, obtained from these 4 x 10 6 
realizations of the algorithm which produce bistochastic matrices inside a layer of width 0.02 along the section. 

As expected for the critical value s* of the Dirichlet parameter, the resulting distribution is flat in the vicinity of 
the center of the polytope. However, this distribution is not globally uniform and shows a slight enhancement of the 
probability (darker color) along the boundary of the polytope. 

This feature is further visible in Fig 4, which shows a comparison of the results obtained for two different initial 
measures on a one-dimensional cross section of Fig. 3 . Although the measure obtained for the critical parameter s* 
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is indeed uniform in the vicinity of the center, namely around B\\ = 1/3 , the measure induced by random stochastic 
matrices with the flat measure, s = 1, displays similar properties. Since for larger matrix size N the value of the optimal 
parameter s* tends to unity as 1 - 1 /TV 2 , it seems reasonable to generate random bistochastic matrices of a larger size 
initiating the iterative Algorithm (1) with random stochastic matrices distributed according to the uniform measure, (i.e. 
each column is generated independently according to the Dirichlet distribution with 5=1). 

2.0 
1.8 
1.6 
1.4 
1.2 
P 1.0 
0.8 
0.6 
0.4 
0.2 

1/3 2/3 

e i1 

Figure 4: Probability density along the line B\o = \ of Fig. 3 obtained from 5 x 10 3 events for two initial measures: (a) 
the critical parameter s — s* (marked by + and decorated by a solid line to guide the eye) and (b) the fiat measure 5=1 
(marked by O) . 




5 Estimation of the volume of the Birkhoff polytope 

The set !B N of bistochastic matrices of size N forms a convex polytope in R^ 1 '". Its volume with respect to the Euclidean 
measure is known for N - 2 , 10 [5,6]. The concrete numbers depend on the normalization chosen. For instance, in 
the simplest case the set 2?2 forms an interval d e [0,1], any point of which corresponds to the bistochastic matrix, 
B(d) = ( l d d If the range of the single, independent element is concerned, the relative volume of the polytope reads 
v(2?2) = 1. On the other hand, if we regard this set as an interval in R 4 , its length is equal to the volume of the Birkhoff 
polytope, Vol (S2) = V4 = 2. In general, both definitions of the volumes are related by [12] 

Vol(Sjv) = N N - 1 v(S N ) . (40) 

In Section 3 we derived formula (37) , giving the probability distribution Pt at the center of the Birkhoff polytope 
induced by the Dirichlet measure on the space of input stochastic matrices. If all Dirichlet parameters are equal to 5, = s 
for ; = 1, .. .N then formula (37) simplifies to 

r(N 2 s) r( s f 
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Making use of the Stirling expansion 



* V27T X 



-1/2 „-> 



and plugging it into eq. (41) we obtain an approximation valid for a large matrix size N 
P* 



N Nl - N {2n) N - l l 2 s sNl - N+l l 2 [T{s)Y Nl exp\-sN 2 + -+0 - 



6s 



N 



(42) 



(43) 



For s = s* = 1 — 1 IN + 0(1 /N ) this distribution is flat in the vicinity of the center - compare eq. (39) . Assuming 



it is close to uniform in the entire Birkhoff polytope, we obtain an approximation of its relative volume, v (£ N ) 
Substituting s* into (43) we arrive at 



l/P 



AT 



v(S N ) * N N - Nl {2nfl 2 - N exp jjV 2 + C + o(~J 



(44) 



Making use of the expansion T(\ + x) = 1 - yx + 0(x 2 ) we can express the value of C by the Euler gamma constant 
7* 0. 577 215 665 . . . The result is C = y - 1/6 ~ 0. 410 548 998 . . .. 

Interestingly, the above approximation is identical, up to a value of this constant, with the recent result of Canfield and 
Mackay [12]. Making use of the relation (40) we see that their asymptotic formula for the volume vol(S^) of the Birkhoff 
polytope is consistent with eq. (44) for C = 1/3 . This fact provides a strong argument that the distribution generated 
by the Dirichlet measure with s — s* , is close (but not equal) to the uniform distribution inside the Birkhoff polytope. 
Furthermore, the initially flat distribution of the stochastic matrices, obtained for s = 1, leads to yet another reasonable 
approximation for the relative volume of £ N , equivalent to (44) with C = -1/6 . 



6 Concluding Remarks 

In this paper we introduced several ensembles of random stochastic matrices. Each of them can be considered as an 
ensemble of initial points used as input data for the Sinkhorn Algorithm, which generates bistochastic matrices. Thus 
any probability measure W[M] in the set of stochastic matrices induces a certain probability measure P[B] in the set of 
bistochastic matrices. 

Let us emphasize that the iterative procedure of Sinkhorn [10] applied in this work, covers the entire set of bistochastic 
matrices. This is not the case for the ensemble of unistochastic matrices, which are obtained from a unitary matrix by 
squaring moduli of its elements. Due to unitarity of U the matrix = | E/y is bistochastic, and the Haar measure on 
U(N) induces a certain measure inside the Birkhoff polytope [20]. However, for N > 3, this measure does not cover the 
entire Birkhoff polytope since in this case there exist bistochastic matrices which are not unistochastic [1,20]. 

In the general case of arbitrary Af we derive an integral expression representing the probability distribution inside the 
(N - l) 2 -dimensional Birkhoff polytope Sn of bistochastic matrices. In the simplest case of = 2 it is straightforward 
to obtain explicit formulae for the probability distribution in the set of bistochastic matrices induced by the ensemble 
of stochastic matrices, in which both columns are independent. Furthermore, we find that to generate the uniform (flat) 
measure, P[B] = const, one needs to start with random stochastic matrices of size 2 distributed according to eq. (31), for 
which both columns are correlated. 

For an arbitrary the integral form for the probability distribution can be explicitly worked out for a particular point 
— the flat, van der Waerden matrix (32) located at the center of the Birkhoff polytope. In this case we obtain an explicit 
formula for the probability distribution at this point as a function of the parameters [s, | 1 < i < AO defining the Dirichlet 
distribution for each column of the initially random stochastic matrix. Expanding the probability density in the vicinity of 
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Bt we find the condition for the optimal parameters s,- = s* , for which the density P[B] is flat in this region. Discrepancy 
of the measure constructed in this way from the uniform distribution is numerically analyzed in the case N - 3. 

This measure is symmetric with respect to permutations of rows and columns of the matrix and for large N it tends to 
the uniform measure in the set of bistochastic matrices. For large N the optimal Dirichlet parameter s* tends to unity 
as 1 - l/N 2 . Thus we may suggest a simplified procedure of taking the initial stochastic matrices according to the flat 
measure, (s = 1). Each column of such a random stochastic matrix is drawn independently and it consists of N numbers 
distributed uniformly in the simplex A#_i. With an initial matrix constructed in this way we are going to run Algorithm 
(1). Such a procedure is shown to work fine already for N - 3. We tend to believe that this scheme of generating random 
bistochastic matrices could be useful for several applications in mathematics, statistics and physics. 

Assuming that a given probability measure in a compact set is flat, the value of the probability density P at an arbitrary 
point x gives us an information about the Euclidean volume of this set, V = l/P(x) . We were pleased to find that the 
optimal algorithm for generating random bistochastic matrices is characterized by an inverse probability 1 /P S [B^] at the 
center B^ of the polytope which displays the same dependence on the dimension N as the volume of the Birkhoff polytope, 
Vol(Sjv), derived in [12]. 

Although in this paper we analyzed dynamics in the classical probability simplex, the main idea of the algorithm may 
be generalized for the quantum dynamics. In such a case a stochastic matrix corresponds to a stochastic map (so called 
quantum operation), which sends the set of quantum states (Hermitean, positive matrices of trace one) into itself [8]. 
A quantum stochastic map is called bistochastic, if it preserves the maximally mixed state, 1 /N. To generate random 
bistochastic maps one can use an analogous technique of alternating projection onto the subspaces in which a given map 
or its dual is stochastic. Such an algorithm suitable for the quantum problem, was proposed independently by Audenaert 
and Scheel [13]. First results concerning various measures induced into the set of quantum stochastic maps are presented 
in [25]. 
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Appendix A 

In this appendix we demonstrate that the Algorithm (2) suitable for analytical calculations is equivalent with the Sinkhorn 
Algorithm (1). 

To apply the former Algorithm (2) one takes some initial matrix M e End [R+ J and makes it bistochastic by means 
of left- and right-multiplication by two matrices D L , and D R . The latter are limits of convergent sequences of diagonal 
matrices D L = lim„ and D R = lim„ D R and the finally B = D L MD R . 

In a similar way, Algorithm (1) performs the same task of transforming the initially stochastic matrix M e End into 
a bistochastic matrix B by alternating rows- and columns-normalization (R and C, for short), which in turn is the same 
of left-, respectively right-multiplication by diagonal matrices. Once a matrix M = \m pq > 0} is given to renormalize the 
p th row means to divide each of its elements by the factor £ 9 m pq , 



m 



pq 



m 




with 




(45a) 
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Analogously, to renormalize the q th column means to divide each of its elements by the factor £ p m pq > 



pq 



m pq R q 



with 



(45b) 



Let us now ran the Algorithm 1, taking as an input a generic M (0) = {mfl > 0), and set 1CRCRCR. .. to be the 
row-column renormalization sequence, where the first symbol 1 denotes the dummy operation 



1 



„(0) 

pq 



JO) _ f(0)„,(0) 



pq 



L p m pq 



(46a) 



Now we start with equations (45b) 



J_ _ V ™(0) _ V f (0) 
fi(0) _ Zj pi ~ Zj p 



(0) 

pq 



v ? p 
followed by (45 a) 



l pq pq q 



t (0) „,(0) ft(0) 
u p " l pq n q ■ 



(46b) 



_L _ y _ y f (0) (0) wo) 
? (i) ~ 2-i pi 2-i p pq q 



p q q 
The next two steps are 



^=Z<=Z l p 1) < 



(!) 



m 



m pq 



,(2) 



L (1) m (1) 



" l pq n q 



L^Lfmf q RfR^ 



(46c) 



(46d) 



and 



J_ _ y _(2) - y f (1) f (0) m (0) S(0) A 

f(2) ~ 2-i pq ~ 2-i p p m pq K q ■ 
p q q 

so that the iteration procedure can be written as 

1 



q 



f(n) f (n-1) nij rt.i 
> ^P ^P ^P 

1 



r(l)f(0) 



ft(n)6(n-l) _ _ _ 6(1)6(0) 

The latter form can be rewritten more compactly, 



= V m (0 > /? (0) /? (1) ■ 
/ 1 pq p q 



Zf(»)f(»-D . . . f(l)f(C 
^p ^p ^p ^p 



(0) (J)) 
p? ' 



,10) 



z ^ 



(46e) 



(47) 



(48) 



where we introduced new variables 



Lto^Y\L? and Rf>:=Y\R ( ?. (49) 
t=\ e=\ 
Equation (47) is formally equivalent to (7), the only difference being in the number of component of L vectors, respectively 
L, that are processed: in Algorithm (1) one iterates all L " , whereas in Algorithm (2) the element L " is fixed to unity in 
each step. We know that the solution of the limit equation for L ( "' is not unique. But the only non-uniqueness is due to 
multiplication by a fixed factor 77 > 0. 
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Appendix B 

In this appendix we present the basic steps allowing one to derive the central result of this work - the second order 
expansion (36) around the center of the Birkhoff polytope of the probability distribution generated by Dirichlet random 
stochastic matrices. 

Since P S [{Z?, ; }] > 0, it is convenient to expand lnP s [{Z?,j}] . We denote the sum of the Dirichlet parameters for each 
column by <x = Ylj=\ s j an d start with the following integral 



poo poo i n \ N 

Gs[{Sy}]:= Jo'-l fK< -1 n 



lip , n Ns n . 

u=i [Zh <*h jj + Tih a h o~B hw \ 



6{a N - 1) 



■■■ Idarar 1 
Jo Jo \ L r= f ) 



Expanding the function ]n(l+x)-x-^-+0 (x 3 ) , 



Ntr 



exp 



■J] Ns w In 



u=l 



1 + 



2/, a h SB hi 
2/i a h Jf , 



6{a N -l). (50) 



exp 



- J] Ns w In 



1 + 



2/i a h SB hy 



2/i &h Jf ) 

and then e~ x « 1 — x + x 2 /2 we get 



= exp 



n = 1 



2/< Oh SBfg 

2/i ff/i 77 , 



0((6B) 3 ) 



(51) 



Q s [{B ij }]=N Nlr ••• Ida.ar 1 
Jo Jo ^ r= f J 



;V.r 



5(a w - 1) (, sr2 2 



2/1 a/i 
at/2 2a ah 6B h 



0((SBf) . (52) 



Thus we have to integrate the following expression for an arbitrary vector of parameters § w > 



Jo Jo l;. = f J 



X 5(un - 1) py 
(2/, »/i) ,,=T 



2/, 



i /-»CO /-»oo / A/" \ s-*oo 

■■■ Idarar 1 a*c£-ap^ 5(a N -\)\ dt e"'^ t N ^ m ~ l , (53) 

f + m) Jo Jo I if j Jo 



r(<x/y 

Here m := &i + §2 + • • • + &n-\ + , so the integral reads 



/ = 



r(cr + #i) r(o- + § 2 )---T(cr + ^ N - l ) T(cr + § N ) T (cr) 



r(o-A0 y~[ r ( cr + )? ./) 



r (cr/Y + m) 



with 



n u? 



y 1 2/, o» / / ' T (cr/V + 2/ 0/) y r (cr) 



n 



(54) 



(55) 
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This expression, completely symmetric in all variables aci , a% . . . a^-i , un allows us to calculate the expansion of the 
integral (52) : 



Qs[{Bij}] = N 1 



r(tr-) 

T(crN) 



1 - N 



r2 IZw s w Yih a hSBh 



Zh a i, 



+ -z 

w 



l^rK P^ H' <56) 



Therefore we need 



Zh a h 



(Zh 

a„a„ 



r(cr+ 1) r(crA0 cr _ 1 

T(o-) ' T(o-N+l) ~ crN ~ N 

r(cr + 2) T(crA0 cr(0-+l) cr + 1 



Thus the second term in (56) is 



r(cr) T{crN + 2) o-N(o-N+l) N(crN+l) 
T(cr + l)T(cr + l) r((rN) cr ■ cr cr 



F(cr) r(cr) r(crJV + 2) crN {crN + 1) N (crN + 1) 



( ^ ) = — > . ^fi/nr s» = , because of (35) • 



For the third term we have 



= Z '* Z Z (-^r?) ^ + Z '* Z (6B " w) 

w h h'*h \ \Lh a h) I w h * Khh a h) I 

l{crN+ 1) 



= Z ttth?-^ Z ^ Z SB ^ + Z ^ ^ Z 

N(crN + 1) 4"* ^ ^ N(crN + 1) 4-* 



and using from (35) the relation Zuth SBhtw = —6Bi, w we get 

(o- + l) 



/=Z **Z (5B/,h ' )2 ( 



N(crN+l) N((rN+l)) N(crN+\) 



I = 1 77 Z s » Z (^h-) 2 

/ iv(o-jv+ 1) 4^ v 



For the fourth term we obtain 



(57) 



(58) 



(59) 



Zh a h Zw SySBfo 

Zh a h 



1 



N(crN+ 1) 



2] ^ s w 6B hw 



Finally, expression (56) yields: 



Q.[{Bij}]=N 



Mr T(cT) iV 



1 + JWTT) Z ** ^ + J^TTT) Z [Z * * B -J + (^) 

v 7 hw h V w / 



yV 3 



F(crA0 2(crJV+l) 



(60) 



(61) 
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In principle we are able to calculate all higher terms. There are two other terms to be expanded: n^ 0= i Bpq Sq 1 an d 



det 1 - BB 1 . For the latter we have 

L J A'-l 



det [l - BB 1 ] = det 

L Jat-i 



;'=l v 



= exp •! In det 



A'-l 



D ik -J^6B u 6B kj 



7=1 



(62) 



A'-l; 



In the last line we made use of equation (35) and we introduced the (N - l)x(N - 1) circulant [24] matrix D, k := 6% -l/N . 
As it can be verified by direct matrix multiplication, the inverse of D reads Dt} = 5 jk + 1 . Hence, factorizing the determinant 
of the product in the product of determinants, it follows from (62) 



det [l - BB 1 ] = det \d] x det 

L 1n-\ I 1n-i 



A'-l N 



7=1 k=l 



(63) 



A'-l 



Observe that the index j labels the (N - 1) columns of the matrix [-D -1 ] j , whereas k runs from 1 to N , since we are 
going to consider [{SB) (6B) t \ n ^ and not [(^B)]^ ^ [(tf.B) 1 ]^ [ . Using the property of circulant matrices [24], we can 
determine the spectrum of D , consisting of a simple eigenvalue 1 /N and another one equal to 1 , of multiplicity (N -2). 
Thus det [d] = l/N and eq. (35) and (63) yield 



j « A 

det [l - BB j ] n ^ = - x det 5 it - ^ 5B ik 6B [k + 6B Nk 6B ek 

I k=l k=\ 

From the identity det [exp (A)] = exp [Tr (A)] , with the substitution A <— log (1 + X) we get 
det (1 - X) = exp {Tr [log (1 - X)]} = exp {Tr \-X + O (x 2 )]} 



(64) 



A'-l 



= exp {-Tr (X) + O [Tr (x 2 )]} = 1 - Tr (X) 



[Ti(X)f 



■0[Tr(x 2 )} 



so that, choosing for X the (5B)'s contributions in equation (64) , we get Tr (X) = 2a*=i SBc k 5B( k and therefore 



H 1 - jf j 1 - Z (m k ) 2 + o((6B)^ . 



(65) 



Finally we use the expansion 



<5B, : 



1 



P9 / ( yyN(o--JV) 



1 + y Z K*) 2 - t Z ^K ? ) z +o((/v^) 3 )| 



2 A' 



x2 yV- 



2 W 



p,q=l 



(66) 



Now, substituting (61) and (65-66) into (34) , we obtain the final formula for the resulting probability distribution around 
the center of the Birkhoff polytope & N given by ((36)). 



Appendix C 

In this appendix we provide the third order expansion of the probability distribution P S (B) at B = B+. The result ob- 
tained implies that it is not possible to find an ensemble of stochastic matrices characterised by the Dirichlet distribution, 
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which induces a distribution flat up to the third order at the center of the Birkhoff polytope. Furthermore, we provide an 
estimation, that is how the asymmetry of the optimal distribution around B* changes with N. 

For general s the output distribution behaves like P. s [{B,j}] °c exp Yj Pq {dBpq) j at the center, with 



A ~ 2 2(N 2 s+l) 



(67) 



From now on, symbols like P s , P s , V s , W s denote the probability densities obtained from the input described by the 
string 

s - { s\ = s, S2 = s, . . . , s N = s } consisting of N Dirichlet exponents equal. Since gj < , the distribution is Gaussian for 
s > s* . 

In order to study the deviations from the Gaussian distribution, we now study the third order contribution to P s [{B,y}] of 
eq. (34) , in the case s, = s . Under the latter hypothesis, many terms of the kind £ 9 s q 8B pq vanish for (35) so such terms 
will be omitted. 

The distribution (34) can be factorized into a product of three factors: 
• n^ 9 =i Bpq gives a contribution 



1 



(68) 



det [l - j gives no 3 rd order contribution (just the overall factor 1 /N already present in (65)) ; 

the integral Q s [[Bij]\ of (50) gives 



= N 



Ncr 



T(crN) 



3 2-i 



0((6B) 4 )\ , 



(69) 



where we made use of the symbol (■) introduced through eqs. (53-55) . 

Using the same reasoning as in Appendix B, including now the new contributions (68-69), we arrive at the 3 rd order 
contribution for , 



A 3 P S [{B, 7 }] =P*M(s-l)t (SB„f - ^ J] 

I P.9=l j 



2/ at SBij 
Zk a k 



(70) 



The last term reads 



'iZiOiSBij 



Zk a k 



a i a 2 QT3 



^ 6B pj 5B V j5B T j + 'i 



(Lk a k) 3 i ; 



\ (Lk a k) I u 



(71) 
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It foUows from (35) , that ^ SB Tj = = Z 

5B T j + <5B W - , so 2t#/j <^t/ - Multiplying this equality by {SB^ and 

summing over one gets 

2 (SBn) 2 5B Tj = - J] (tfB w -) 3 . (72a) 



Similarly 



2 «m/ = = Z SB * 6B *i = Z SB y j SB Tj + 3 £ (SB^f SB Tj + £ (<5B W ) 3 



and using (72a) we arrive at 



Substituting eqs. (72) into (71) one obtains 



Tji a i5Bij 



Zk a k 



a\a>2 



(Lk a k) I \ (Lk a k) I \ (Lk a k) 



+ 2 



a\ ao a3 



Now we use (55) 



Iik a k 



( r(o- + 3) 3 r(o- + 2)r(o- + i) | 2 

\ T(cr) [T(cr)] 2 
N(Ncr + l)(Ncr + 2) Z 



r(o- + i) 



r(o-) 



r(A^cr) 



r(Mr + 3) 



Thus, from (70) , the third order contribution to P S [{B, ; }] is 

A 3 ?,[{By}] = P N 



*((s-l)N 3 2N 3 s 



'cr+l)(Ncr + 2) fZ( 5 ^) 



and, near the center B* , P s [{By}] has the following structure: 



P,[{Btj}] = P* exp J -c 2 J] (5B M ) 2 - c 3 J] (<5B W ) 3 + (9 [(<5B) 4 ] 



P9 pq 

Assuming that s, = s (so cr = iVs) we may then find from (36) the value of the constant c-i, 

N 2 cr 2 N 2 
c 2 = 1 - — + 



2 2(<xjV+l) 



(72b) 



(73) 



(74) 



(75) 



(76a) 
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Similarly eq. (74) implies that the third constant reads, 




(No- + I ) (No- + 2) 



2a 



) 



(76b) 



Adjusting s = cr/N appropriately to the size N of the matrix one may find such a value of the Dirichlet parameter s that 
C2 or C3 are equal to zero. However, if we set C2 to zero, the parameter C3 is non zero, so the third order terms remain 
in eq. (75). Thus we have shown that it is not possible to find an initial Dirichlet distribution which gives the output 
distribution uniform in the vicinity of the center of the Birkhoff polytope up to the third order. A power expansion of C3 
gives 



Thus the scale of the asymmetry is SB cc N ^ 3 so it cannot be seen for | SB \ < N ^ 3 that means if I SB/B^ I < N 2 ^ . 
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